Seminar 1. EEG analysis#

Plan

  1. Read and visualize the data

  2. Preprocess the data

  3. Use ICA for noise reduction

  4. Compute ERP and plot topomaps for ERP

  5. Compute beta band envelopes for ERP

  6. Compute coherence

Part 1#

All preprocessing and some data analysis of EEG data can be done using the Python library MNE.

# For Colab only
# !pip install mne
import warnings
warnings.filterwarnings("ignore")

import mne
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
import mne_connectivity
# not in colab
%matplotlib notebook
# in colab
# %matplotlib inline

mne.io includes functions for reading different EEG record formats: https://mne.tools/stable/documentation/implementation.html#supported-data-formats

We will work with data for one participant from the EEG Motor Movement/Imagery Dataset.

# !wget "https://www.physionet.org/files/eegmmidb/1.0.0/S003/S003R03.edf"
# !wget "https://www.physionet.org/files/eegmmidb/1.0.0/S003/S003R03.edf.event"
# !ls
sample = mne.io.read_raw_edf('S003R03.edf', verbose=False, preload=True)

Get some information about the recording

sample.info
General
Measurement date August 12, 2009 16:15:00 GMT
Experimenter Unknown
Participant X
Channels
Digitized points Not available
Good channels 64 EEG
Bad channels None
EOG channels Not available
ECG channels Not available
Data
Sampling frequency 160.00 Hz
Highpass 0.00 Hz
Lowpass 80.00 Hz
# Sampling frequency
sample.info['sfreq']
160.0
# Length in seconds
len(sample) / sample.info['sfreq']
125.0
# Number of channels
len(sample.ch_names)
64

Channel selection and adding a montage#

sample.ch_names[:3]
['Fc5.', 'Fc3.', 'Fc1.']
# fix trailing dots in channel names
# use sample.rename_channels(map)

# YOUR CODE HERE
sample.ch_names[:3]
['Fc5.', 'Fc3.', 'Fc1.']
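One possible way to approach the renaming exercise above is sketched below. It only builds the mapping on a short list of names: the first three are taken from the output above, the other two are assumed for illustration. The commented line shows how the same mapping could be passed to sample.rename_channels.

```python
# Example channel names with trailing dots
# (the last two names are assumptions made for this illustration)
raw_names = ['Fc5.', 'Fc3.', 'Fc1.', 'Cz..', 'Fp1.']

# Map every old name to the same name without trailing dots
mapping = {name: name.rstrip('.') for name in raw_names}
print(mapping)

# With the real recording the same idea would be (not run here):
# sample.rename_channels({name: name.rstrip('.') for name in sample.ch_names})
```

After renaming, names such as 'Fp1' or 'Cz' match the labels used in channels_to_use below.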
# 19 channels from the International 10-20 system (A1 and A2 are not included here)
# Be careful: pure 10-20 labeling differs from high-resolution montages
# In MNE, the 'standard_1020' montage is actually an extended high-resolution version of 10-20
# FYI, the mapping from pure 10-20 labels to the high-resolution ones:
# T3 = T7
# T4 = T8
# T5 = P7
# T6 = P8

channels_to_use = [
    # prefrontal
    'Fp1',
    'Fp2',
    # frontal
    'F7',
    'F3',
    'F4',
    'Fz',
    'F8',
    # central and temporal
    'T7',
    'C3',
    'Cz',
    'C4',
    'T8',
    # parietal
    'P7',
    'P3',
    'Pz',
    'P4',
    'P8',
    # occipital
    'O1',
    'O2',
]
sample_1020 = sample.copy().pick(channels_to_use)

# check that everything is OK
assert len(channels_to_use) == len(sample_1020.ch_names)
ValueError: picks (['Fp1', 'Fp2', 'F7', 'F3', 'F4', 'Fz', 'F8', 'T7', 'C3', 'Cz', 'C4', 'T8', 'P7', 'P3', 'Pz', 'P4', 'P8', 'O1', 'O2']) could not be interpreted as channel names (no channel "['Fp1', 'Fp2', 'F7', 'F3', 'F4', 'Fz', 'F8', 'T7', 'C3', 'Cz', 'C4', 'T8', 'P7', 'P3', 'Pz', 'P4', 'P8', 'O1', 'O2']"), channel types (no type "Fp1" present), or a generic type (just "all" or "data")

This error is raised when the trailing dots have not yet been removed from the channel names (the exercise above), so none of the requested names match. Once the channels are renamed, the cell runs without errors.

ten_twenty_montage = mne.channels.make_standard_montage('standard_1020')
len(ten_twenty_montage.ch_names)
94
sample_1020.set_montage(ten_twenty_montage)
General
Measurement date August 12, 2009 16:15:00 GMT
Experimenter Unknown
Participant X
Channels
Digitized points 22 points
Good channels 19 EEG
Bad channels None
EOG channels Not available
ECG channels Not available
Data
Sampling frequency 160.00 Hz
Highpass 0.00 Hz
Lowpass 80.00 Hz
Filenames S003R03.edf
Duration 00:02:05 (HH:MM:SS)
sample_1020.plot_sensors(show_names=True);

Explore the signals#

sample_1020.compute_psd().plot();
Effective window size : 12.800 (s)
Plotting power spectral density (dB=True).

Do you see peaks connected to power line noise?

The purpose of a notch filter is to remove activity at one specific frequency rather than over a frequency range. The alternating current in standard electric outlets in North America oscillates at 60 Hz, and the electric fields it produces in indoor environments frequently contaminate the EEG. Sixty-hertz notch filters are used to attenuate or eliminate this unwanted signal. In countries where the line frequency is 50 Hz, 50-Hz notch filters are used for the same purpose.
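To make the idea of a notch filter concrete, here is a minimal sketch on synthetic data using SciPy rather than MNE. The signal, the quality factor Q=30 and the helper amplitude_at are assumptions made for this illustration. For a Raw object, MNE offers raw.notch_filter(freqs=60) for the same purpose.

```python
import numpy as np
from scipy import signal

fs = 160.0                                   # sampling rate of the recording, Hz
t = np.arange(int(20 * fs)) / fs             # 20 s of synthetic data
brain = np.sin(2 * np.pi * 10 * t)           # 10 Hz "alpha-like" component
line = 0.5 * np.sin(2 * np.pi * 60 * t)      # 60 Hz power line interference
x = brain + line

# Narrow IIR notch centred at 60 Hz; Q sets the notch width (centre / bandwidth)
b, a = signal.iirnotch(60.0, 30.0, fs=fs)
y = signal.filtfilt(b, a, x)                 # forward-backward pass, zero phase


def amplitude_at(sig, freq):
    """Amplitude of the FFT bin closest to `freq` (hypothetical helper)."""
    spectrum = np.abs(np.fft.rfft(sig)) * 2 / len(sig)
    freqs = np.fft.rfftfreq(len(sig), d=1 / fs)
    return spectrum[np.argmin(np.abs(freqs - freq))]


print("60 Hz before/after:", amplitude_at(x, 60), amplitude_at(y, 60))
print("10 Hz before/after:", amplitude_at(x, 10), amplitude_at(y, 10))
```

The 60 Hz component should be strongly attenuated while the 10 Hz component stays almost unchanged.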

Band-pass filtering#

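It is better to remove low-frequency components below 1 Hz (slow drifts) and high-frequency components above 50 Hz, which carry little useful information for this EEG analysis. Because the upper cutoff is below 60 Hz, this band-pass filter also attenuates the power line noise.

Let's use an IIR filter.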

sample_1020.filter(l_freq=1, h_freq=50, method='iir')
Filtering raw data in 1 contiguous segment
Setting up band-pass filter from 1 - 50 Hz

IIR filter parameters
---------------------
Butterworth bandpass zero-phase (two-pass forward and reverse) non-causal filter:
- Filter order 16 (effective, after forward-backward)
- Cutoffs at 1.00, 50.00 Hz: -6.02, -6.02 dB
General
Measurement date August 12, 2009 16:15:00 GMT
Experimenter Unknown
Participant X
Channels
Digitized points 22 points
Good channels 19 EEG
Bad channels None
EOG channels Not available
ECG channels Not available
Data
Sampling frequency 160.00 Hz
Highpass 1.00 Hz
Lowpass 50.00 Hz
Filenames S003R03.edf
Duration 00:02:05 (HH:MM:SS)
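The log above mentions a zero-phase (forward and reverse) Butterworth band-pass. The sketch below shows what such a filter does to a synthetic signal using SciPy. The filter order (4) and the test signal are assumptions for illustration, and this is not meant to reproduce MNE's implementation exactly.

```python
import numpy as np
from scipy import signal

fs = 160.0
t = np.arange(int(60 * fs)) / fs             # 60 s of synthetic data
rhythm = np.sin(2 * np.pi * 10 * t)          # 10 Hz component to keep
drift = 5 * np.sin(2 * np.pi * 0.2 * t)      # slow 0.2 Hz drift to remove
x = rhythm + drift

# 4th-order Butterworth band-pass, 1-50 Hz, as second-order sections
sos = signal.butter(4, [1.0, 50.0], btype="bandpass", fs=fs, output="sos")
# Forward-backward filtering: no phase shift, attenuation is applied twice
y = signal.sosfiltfilt(sos, x)

# Compare with the clean rhythm away from the edges (first/last 5 s skipped)
core = slice(int(5 * fs), -int(5 * fs))
max_error = np.max(np.abs(y[core] - rhythm[core]))
print(f"max deviation from the 10 Hz rhythm: {max_error:.4f}")
```

Because the filter is applied in both directions, the 10 Hz rhythm is not shifted in time, which matters when latencies of evoked responses are compared later.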
# Plot psd after filtering

# YOUR CODE HERE

Plot EEG signals#

sample_1020.plot(n_channels=8, duration=20);
Using matplotlib as 2D backend.
# Plot in better scale. Use 'scalings' argument

# YOUR CODE HERE

Extracting events#

MNE has several functions for event selection.

  • mne.find_events is used when events are stored in trigger channels (e.g. FIFF format)

  • mne.events_from_annotations is used when events are stored in annotations (e.g. EDF+ format)

Check the documentation for your EEG record format.

Here the data are in EDF+ format.

events, events_dict = mne.events_from_annotations(sample_1020)
Used Annotations descriptions: ['T0', 'T1', 'T2']
events_dict
{'T0': 1, 'T1': 2, 'T2': 3}
events[:5]
array([[   0,    0,    1],
       [ 672,    0,    3],
       [1328,    0,    1],
       [2000,    0,    2],
       [2656,    0,    1]])
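Each row of the events array has three columns: the sample index of the event, the value of the trigger channel just before the event (0 here), and the integer event code. The short NumPy sketch below converts the first five events shown above into onset times in seconds and readable labels. The variable names id_to_label and onsets_sec are introduced only for this example.

```python
import numpy as np

sfreq = 160.0                                # sampling frequency, Hz
events = np.array([[0, 0, 1],
                   [672, 0, 3],
                   [1328, 0, 1],
                   [2000, 0, 2],
                   [2656, 0, 1]])
events_dict = {'T0': 1, 'T1': 2, 'T2': 3}

# Sample index -> time in seconds
onsets_sec = events[:, 0] / sfreq

# Invert the dictionary to go from integer code back to annotation label
id_to_label = {code: label for label, code in events_dict.items()}
labels = [id_to_label[code] for code in events[:, 2]]

for onset, label in zip(onsets_sec, labels):
    print(f"{onset:6.2f} s  {label}")
```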

Epochs objects are a data structure for representing and analyzing equal-duration chunks of the EEG/MEG signal. Epochs are most often used to represent data that is time-locked to repeated experimental events. The Raw object and the events array are the bare minimum needed to create an Epochs object, which we create with the mne.Epochs class constructor.

However, you will almost surely want to change some of the other default parameters. Here we’ll change tmin and tmax (the time relative to each event at which to start and end each epoch).

epochs = mne.Epochs(sample_1020, events,  tmin=-0.5, tmax=0.8, preload=True)
Not setting metadata
30 matching events found
Setting baseline interval to [-0.5, 0.0] s
Applying baseline correction (mode: mean)
0 projection items activated
Using data from preloaded Raw for 30 events and 209 original time points ...
1 bad epochs dropped
pd.DataFrame(epochs.events, columns=['_', '__', 'event_id'])['event_id'].value_counts()
event_id
1    14
3     8
2     7
Name: count, dtype: int64

Check that length is right

for epoch in epochs:
    break
epoch.shape
(19, 209)
epoch.shape[1] / sample_1020.info['sfreq']
1.30625
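The 209 samples per epoch follow from tmin, tmax and the sampling frequency, assuming both end points are included. The sketch below reproduces this arithmetic with NumPy. It also suggests why one epoch was dropped: an event at sample 0 would need data from before the start of the recording.

```python
import numpy as np

sfreq = 160.0
tmin, tmax = -0.5, 0.8

first = int(round(tmin * sfreq))   # -80 samples relative to the event
last = int(round(tmax * sfreq))    # +128 samples relative to the event
n_times = last - first + 1         # both end points are included
times = np.arange(first, last + 1) / sfreq

print(n_times, times[0], times[-1], n_times / sfreq)

# An event at sample 0 would need data from sample -80, which does not exist
event_sample = 0
fits_in_recording = event_sample + first >= 0
print(fits_in_recording)
```

This is why the epoch length divided by the sampling frequency gives 1.30625 s rather than exactly 1.3 s.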
sample_1020.to_data_frame().shape
(20000, 20)
df = epochs.to_data_frame()
df.head(3).iloc[:, :10]
time condition epoch Fp1 Fp2 F7 F3 F4 Fz F8
0 -0.50000 3 1 236.675891 244.275703 125.936913 88.829681 112.445086 73.438632 145.805904
1 -0.49375 3 1 175.970007 183.425016 103.803572 63.009365 82.687750 48.730906 115.749079
2 -0.48750 3 1 127.095181 132.985975 61.669835 31.478465 62.299189 33.384794 93.870855
df[sample_1020.ch_names + ['epoch']].groupby('epoch').agg(lambda arr: arr.max() - arr.min()).hist(figsize=[10, 10]);
plt.tight_layout()

Note also that the Epochs constructor accepts a reject parameter for rejecting individual epochs based on peak-to-peak signal amplitude. For EEG channels the threshold is given in volts, so 600e-6 means 600 µV.

epochs = mne.Epochs(sample_1020, events,  tmin=-0.5, tmax=0.8, reject={'eeg': 600e-6}, preload=True, baseline=(-.1, 0))
Not setting metadata
30 matching events found
Applying baseline correction (mode: mean)
0 projection items activated
Using data from preloaded Raw for 30 events and 209 original time points ...
    Rejecting  epoch based on EEG : ['Fp1', 'Fp2']
    Rejecting  epoch based on EEG : ['Fp2']
    Rejecting  epoch based on EEG : ['Fp2']
    Rejecting  epoch based on EEG : ['Fp2']
    Rejecting  epoch based on EEG : ['Fp2']
    Rejecting  epoch based on EEG : ['Fp1', 'Fp2']
    Rejecting  epoch based on EEG : ['Fp1', 'Fp2']
    Rejecting  epoch based on EEG : ['Fp1', 'Fp2']
    Rejecting  epoch based on EEG : ['Fp1', 'Fp2']
    Rejecting  epoch based on EEG : ['Fp1', 'Fp2']
    Rejecting  epoch based on EEG : ['Fp2']
    Rejecting  epoch based on EEG : ['Fp1', 'Fp2']
    Rejecting  epoch based on EEG : ['Fp1', 'Fp2']
    Rejecting  epoch based on EEG : ['Fp1', 'Fp2']
15 bad epochs dropped
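The rejection rule can be illustrated with plain NumPy on synthetic data: an epoch is dropped if the peak-to-peak amplitude of any channel exceeds the threshold. The data and the artifact added here are made up for the example, so this is a sketch of the criterion and not of MNE internals.

```python
import numpy as np

rng = np.random.default_rng(0)
n_epochs, n_channels, n_times = 6, 3, 209
# Synthetic epochs in volts: background activity of about 20 µV
data = rng.normal(scale=20e-6, size=(n_epochs, n_channels, n_times))
# Add a large blink-like deflection (800 µV) to channel 0 of epochs 1 and 4
data[[1, 4], 0, 50:90] += 800e-6

threshold = 600e-6                           # 600 µV, same value as above
ptp = data.max(axis=-1) - data.min(axis=-1)  # peak-to-peak per epoch and channel
bad = (ptp > threshold).any(axis=1)          # reject if any channel exceeds it
clean = data[~bad]

print("rejected epochs:", np.flatnonzero(bad))
print("kept:", clean.shape[0], "of", n_epochs)
```

In the log above almost all rejections come from Fp1 and Fp2, the frontal channels where eye blinks are strongest.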

The PSD computed on epochs differs from the PSD of the raw data: the spectrum is estimated for each epoch separately and then averaged across epochs.

epochs.compute_psd().plot();
    Using multitaper spectrum estimation with 7 DPSS windows
Plotting power spectral density (dB=True).
Averaging across epochs...
epochs.plot(n_channels=8, scalings={'eeg':3e-4});
epochs.event_id
{'1': 1, '2': 2, '3': 3}
# check number of events of each type
# use epochs.events

# Your code here
evoked_T0 = epochs['1'].average()
evoked_T1 = epochs['2'].average()
evoked_T2 = epochs['3'].average()
evoked_T0.plot(spatial_colors=True);
evoked_T1.plot(spatial_colors=True);
evoked_T2.plot(spatial_colors=True);

Part 2#

Independent Component Analysis for Artifact Removal#

ica = mne.preprocessing.ICA(n_components=10, random_state=42)
ica.fit(sample_1020)
Fitting ICA to data using 19 channels (please be patient, this may take a while)
Selecting by number: 10 components
Fitting ICA took 0.4s.
Method fastica
Fit parameters algorithm=parallel
fun=logcosh
fun_args=None
max_iter=1000
Fit 17 iterations on raw data (20000 samples)
ICA components 10
Available PCA components 19
Channel types eeg
ICA components marked for exclusion
ica.plot_sources(sample_1020);
Creating RawArray with float64 data, n_channels=10, n_times=20000
    Range : 0 ... 19999 =      0.000 ...   124.994 secs
Ready.
ica.plot_components();

Inspect the ICA components in more detail and check the power spectrum of each one. The segment-level panels are not very informative here, since we fit ICA on the continuous raw data.

For good (brain-related) components we expect to see alpha and beta rhythm peaks in the spectrum (7-13 Hz and 13-30 Hz respectively), together with a gradual decrease in power as frequency increases.

ica.plot_properties(sample_1020, picks=[4]);
    Using multitaper spectrum estimation with 7 DPSS windows
Not setting metadata
62 matching events found
No baseline correction applied
0 projection items activated
ica.plot_overlay(sample_1020, exclude=[0, 1, 4, 5, 8, 9], picks=['F3']);
Applying ICA to Raw instance
    Transforming to ICA space (10 components)
    Zeroing out 6 ICA components
    Projecting back using 19 PCA components
ica.exclude = [0, 1]
sample_1020_clr = sample_1020.copy()
ica.apply(sample_1020_clr)
Applying ICA to Raw instance
    Transforming to ICA space (10 components)
    Zeroing out 2 ICA components
    Projecting back using 19 PCA components
General
Measurement date August 12, 2009 16:15:00 GMT
Experimenter Unknown
Participant X
Channels
Digitized points 22 points
Good channels 19 EEG
Bad channels None
EOG channels Not available
ECG channels Not available
Data
Sampling frequency 160.00 Hz
Highpass 1.00 Hz
Lowpass 50.00 Hz
Filenames S003R03.edf
Duration 00:02:05 (HH:MM:SS)
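The three steps printed by ica.apply (transform to component space, zero out the excluded components, project back) are plain linear algebra. The sketch below shows them with NumPy on synthetic data where the mixing matrix is assumed to be known. A real ICA has to estimate it from the data, and MNE additionally uses a PCA step, so this only illustrates the principle.

```python
import numpy as np

rng = np.random.default_rng(42)
n_sources, n_channels, n_times = 3, 5, 1000
sources = rng.laplace(size=(n_sources, n_times))     # non-Gaussian sources
mixing = rng.normal(size=(n_channels, n_sources))    # assumed known mixing matrix
data = mixing @ sources                              # "recorded" channels

# 1. Transform to component space (here via the pseudo-inverse of the mixing)
unmixing = np.linalg.pinv(mixing)
components = unmixing @ data
# 2. Zero out the components marked as artifacts
exclude = [0]
components[exclude] = 0.0
# 3. Project back to channel space
cleaned = mixing @ components

# The result equals the contribution of the remaining sources only
expected = mixing[:, 1:] @ sources[1:]
print(np.allclose(cleaned, expected))
```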
# plot channels

# YOUR CODE HERE
epochs = mne.Epochs(sample_1020_clr, events,  tmin=-0.5, tmax=0.8, reject={'eeg': 600e-6}, preload=True, baseline=(-.1, 0))
Not setting metadata
30 matching events found
Applying baseline correction (mode: mean)
0 projection items activated
Using data from preloaded Raw for 30 events and 209 original time points ...
1 bad epochs dropped
evoked_T0 = epochs['1'].average()
evoked_T1 = epochs['2'].average()
evoked_T2 = epochs['3'].average()
evoked_T0.plot(spatial_colors=True);
evoked_T1.plot(spatial_colors=True);
evoked_T2.plot(spatial_colors=True);
evoked_T0.plot_topomap(times=[0, .2, .4, .6, .8], vlim=(-50,50));
evoked_T1.plot_topomap(times=[0, .2, .4, .6, .8], vlim=(-50,50));
evoked_T2.plot_topomap(times=[0, .2, .4, .6, .8], vlim=(-50,50));

Dynamics of alpha and beta activity#

evoked_T0_alpha = evoked_T0.copy().filter(l_freq=7, h_freq=13, method='iir', verbose=False).apply_hilbert(envelope=True)
evoked_T1_alpha = evoked_T1.copy().filter(l_freq=7, h_freq=13, method='iir', verbose=False).apply_hilbert(envelope=True)
evoked_T2_alpha = evoked_T2.copy().filter(l_freq=7, h_freq=13, method='iir', verbose=False).apply_hilbert(envelope=True)
evoked_T1_alpha.plot(spatial_colors=True);
evoked_T0_alpha.plot_topomap(times=[0, .1, .2, .3, .4, .6], vlim=(0,30));
evoked_T1_alpha.plot_topomap(times=[0, .1, .2, .3, .4, .6], vlim=(0,30));
evoked_T2_alpha.plot_topomap(times=[0, .1, .2, .3, .4, .6], vlim=(0,30));
evoked_T0_beta_low = evoked_T0.copy().filter(l_freq=13, h_freq=20, method='iir', verbose=False).apply_hilbert(envelope=True)
evoked_T1_beta_low = evoked_T1.copy().filter(l_freq=13, h_freq=20, method='iir', verbose=False).apply_hilbert(envelope=True)
evoked_T2_beta_low = evoked_T2.copy().filter(l_freq=13, h_freq=20, method='iir', verbose=False).apply_hilbert(envelope=True)
evoked_T0_beta_low.plot_topomap(times=[0, .2, .4, .6, .8], vlim=(0,30));
evoked_T1_beta_low.plot_topomap(times=[0, .2, .4, .6, .8], vlim=(0,30));
evoked_T2_beta_low.plot_topomap(times=[0, .2, .4, .6, .8], vlim=(0,30));
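The band envelopes above combine a band-pass filter with the Hilbert transform. The following sketch shows the same two steps with SciPy on a synthetic signal whose amplitude modulation is known, so the recovered envelope can be checked. The signal and the filter order are assumptions for illustration.

```python
import numpy as np
from scipy import signal

fs = 160.0
t = np.arange(int(10 * fs)) / fs
true_envelope = 1 + 0.5 * np.sin(2 * np.pi * 0.5 * t)   # slow amplitude modulation
alpha = true_envelope * np.sin(2 * np.pi * 10 * t)      # modulated 10 Hz rhythm
x = alpha + 0.5 * np.sin(2 * np.pi * 25 * t)            # plus a 25 Hz component

# 1. Isolate the band of interest (7-13 Hz, as for alpha above)
sos = signal.butter(4, [7.0, 13.0], btype="bandpass", fs=fs, output="sos")
band = signal.sosfiltfilt(sos, x)
# 2. Envelope = magnitude of the analytic signal
envelope = np.abs(signal.hilbert(band))

core = slice(int(1 * fs), -int(1 * fs))      # ignore edge effects
max_error = np.max(np.abs(envelope[core] - true_envelope[core]))
print(f"max envelope error: {max_error:.4f}")
```

The envelope follows the slow modulation of the rhythm and is always non-negative, which is why the topomaps above use a color range starting at 0.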

Computing functional connectivity#

conn_T1 = mne_connectivity.spectral_connectivity_epochs(epochs['2'], method='coh')
Adding metadata with 3 columns
Connectivity computation...
only using indices for lower-triangular matrix
    computing connectivity for 171 connections
    using t=-0.500s..0.800s for estimation (209 points)
    frequencies: 4.6Hz..79.6Hz (99 points)
    Using multitaper spectrum estimation with 7 DPSS windows
    the following metrics will be computed: Coherence
    computing cross-spectral density for epoch 1
    computing cross-spectral density for epoch 2
    computing cross-spectral density for epoch 3
    computing cross-spectral density for epoch 4
    computing cross-spectral density for epoch 5
    computing cross-spectral density for epoch 6
    computing cross-spectral density for epoch 7
    assembling connectivity matrix
[Connectivity computation done]
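Coherence measures how consistently two signals are linearly related at each frequency. The sketch below illustrates this with scipy.signal.coherence on two synthetic channels that share a 20 Hz rhythm. Two differences from the call above should be kept in mind: SciPy averages over Welch segments of one long signal instead of over epochs, and it returns the squared magnitude, so the values are not directly comparable with method='coh'.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
fs = 160.0
t = np.arange(int(60 * fs)) / fs
shared = np.sin(2 * np.pi * 20 * t)               # common 20 Hz rhythm
ch1 = shared + rng.normal(size=t.size)            # plus independent noise
ch2 = shared + rng.normal(size=t.size)

# Magnitude-squared coherence from Welch-averaged spectra (2 s segments)
freqs, coh = signal.coherence(ch1, ch2, fs=fs, nperseg=320)

peak = coh[np.argmin(np.abs(freqs - 20))]
background = coh[(freqs >= 40) & (freqs <= 60)].mean()
print(f"coherence at 20 Hz: {peak:.2f}, mean over 40-60 Hz: {background:.2f}")
```

Coherence is close to 1 only at the shared frequency. Elsewhere the channels contain independent noise and the estimate stays low, although it never reaches exactly 0 with a finite number of segments.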
def plot_topomap_connectivity_2d(info, con, picks=None, pairs=None, vmin=None, vmax=None, cm=None, show_values=False, show_names=True):
    """
    Plots connectivity-like data in 2d
    
    Drawing every pair of channels will likely make a mess
    There are two options to avoid it:
    - provide picks
    - provide specific pairs of channels to draw
    """
    
    # get positions
    _, pos, _, _, _, _, _ = mne.viz.topomap._prepare_topomap_plot(info, 'eeg');
    
#     if picks is None and pairs is None:
#         picks = info.ch_names
    
    ch_names_lower = [ch.lower() for ch in info.ch_names]
    if picks:
        picks_lower = [ch.lower() for ch in picks]
    if pairs:
        pairs_lower = [tuple(sorted([ch1.lower(), ch2.lower()])) for ch1, ch2 in pairs]
    
    rows = []
    for idx1, ch1 in enumerate(ch_names_lower):
        for idx2, ch2 in enumerate(ch_names_lower):
            if ch1 >= ch2:
                continue
            if picks and (ch1 not in picks_lower or ch2 not in picks_lower):
                    continue
            if pairs and (ch1, ch2) not in pairs_lower:
                    continue
            rows.append((
                pos[idx1],
                pos[idx2],
                con[idx1, idx2]
            ))
    
    if not len(rows):
        raise ValueError('No pairs to plot')
    
    con_to_plot = np.array([row[2] for row in rows])
    if vmin is None:
        vmin = np.percentile(con_to_plot, 2)
    if vmax is None:
        vmax = np.percentile(con_to_plot, 98)
    norm = matplotlib.colors.Normalize(vmin=vmin, vmax=vmax)
    
    if cm is None:
        cm = sns.diverging_palette(240, 10, as_cmap=True)
    
    fig, ax = plt.subplots(figsize=[5, 5])
    mne.viz.utils.plot_sensors(info, show_names=show_names, show=False, axes=ax);
    for row in rows:
        rgba_color = cm(norm(row[2]))
        plt.plot([row[0][0], row[1][0]], [row[0][1], row[1][1]], color=rgba_color)
        if show_values:
            plt.text((row[0][0] + row[1][0]) / 2, 
                     (row[0][1] + row[1][1]) / 2, 
                     '{:.2f}'.format(row[2]))
conn_T0 = mne_connectivity.spectral_connectivity_epochs(epochs['1'], method='coh', verbose=False);
conn_T1 = mne_connectivity.spectral_connectivity_epochs(epochs['2'], method='coh', verbose=False);
conn_T2 = mne_connectivity.spectral_connectivity_epochs(epochs['3'], method='coh', verbose=False);
conn_T0.freqs
[4.593301435406698,
 5.358851674641148,
 6.124401913875597,
 6.8899521531100465,
 7.655502392344497,
 8.421052631578947,
 9.186602870813395,
 9.952153110047846,
 10.717703349282296,
 11.483253588516744,
 12.248803827751194,
 13.014354066985645,
 13.779904306220093,
 14.545454545454543,
 15.311004784688993,
 16.076555023923444,
 16.842105263157894,
 17.60765550239234,
 18.37320574162679,
 19.13875598086124,
 19.90430622009569,
 20.66985645933014,
 21.43540669856459,
 22.200956937799038,
 22.96650717703349,
 23.73205741626794,
 24.49760765550239,
 25.26315789473684,
 26.02870813397129,
 26.79425837320574,
 27.559808612440186,
 28.325358851674636,
 29.090909090909086,
 29.856459330143537,
 30.622009569377987,
 31.387559808612437,
 32.15311004784689,
 32.91866028708134,
 33.68421052631579,
 34.44976076555023,
 35.21531100478468,
 35.98086124401913,
 36.74641148325358,
 37.51196172248803,
 38.27751196172248,
 39.04306220095693,
 39.80861244019138,
 40.57416267942583,
 41.33971291866028,
 42.10526315789473,
 42.87081339712918,
 43.63636363636363,
 44.401913875598076,
 45.167464114832526,
 45.93301435406698,
 46.69856459330143,
 47.46411483253588,
 48.22966507177033,
 48.99521531100478,
 49.76076555023923,
 50.52631578947368,
 51.29186602870813,
 52.05741626794258,
 52.82296650717703,
 53.58851674641148,
 54.35406698564592,
 55.11961722488037,
 55.88516746411482,
 56.65071770334927,
 57.41626794258372,
 58.18181818181817,
 58.94736842105262,
 59.71291866028707,
 60.47846889952152,
 61.244019138755974,
 62.009569377990424,
 62.775119617224874,
 63.540669856459324,
 64.30622009569377,
 65.07177033492822,
 65.83732057416267,
 66.60287081339712,
 67.36842105263158,
 68.13397129186602,
 68.89952153110046,
 69.66507177033492,
 70.43062200956936,
 71.19617224880382,
 71.96172248803826,
 72.72727272727272,
 73.49282296650716,
 74.25837320574162,
 75.02392344497606,
 75.78947368421052,
 76.55502392344496,
 77.32057416267942,
 78.08612440191386,
 78.8516746411483,
 79.61722488038276]
# indices 12:27 of conn_T0.freqs correspond to about 13.8-24.5 Hz
conn_T0_beta = conn_T0.get_data(output="dense")[:, :, 12:27].mean(axis=2)
conn_T0_beta = conn_T0_beta + conn_T0_beta.T

conn_T1_beta = conn_T1.get_data(output="dense")[:, :, 12:27].mean(axis=2)
conn_T1_beta = conn_T1_beta + conn_T1_beta.T

conn_T2_beta = conn_T2.get_data(output="dense")[:, :, 12:27].mean(axis=2)
conn_T2_beta = conn_T2_beta + conn_T2_beta.T
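Hard-coded indices such as 12:27 are easy to get wrong, so it can help to select a band by frequency instead. The sketch below rebuilds the frequency grid printed above (multiples of sfreq / n_times) and builds a boolean mask for 13-30 Hz. The zero array dense is only a placeholder for the output of get_data(output="dense"). Alternatively, spectral_connectivity_epochs accepts fmin, fmax and faverage arguments that restrict and average the band at computation time.

```python
import numpy as np

sfreq, n_times = 160.0, 209
# Same frequency grid as conn_T0.freqs above: multiples of sfreq / n_times
freqs = np.arange(6, 105) * sfreq / n_times

# The hard-coded slice 12:27 covers roughly 13.8-24.5 Hz
print(freqs[12], freqs[26])

# Selecting by frequency instead of by index, e.g. 13-30 Hz
beta_mask = (freqs >= 13) & (freqs <= 30)
print(freqs[beta_mask][0], freqs[beta_mask][-1], beta_mask.sum())

# With a dense connectivity array of shape (n_channels, n_channels, n_freqs)
dense = np.zeros((19, 19, freqs.size))       # placeholder for real data
band_mean = dense[:, :, beta_mask].mean(axis=2)
sym = band_mean + band_mean.T                # fill the upper triangle
```

The same mask with different limits (for example 7-13 Hz) can be used for the alpha band.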
plot_topomap_connectivity_2d(epochs.info, conn_T1_beta, picks=epochs.ch_names);
plot_topomap_connectivity_2d(epochs.info, conn_T0_beta, 
                             pairs=[('F7', 'F4'), ('O2', 'T7'), ('C3', 'C4'), ('P7', 'P8'), ('F8', 'T8'), ('O1', 'O2'), ('O1', 'P4')],
                             show_values=True,
                             show_names=False
                            
                            );
# calculate coherence in alpha band
# plot 5-10 pairs that you are interested in

# YOUR CODE HERE